Statistical string similarity model for information linkage
Authors
Abstract
Similar resources
A Statistical Model for Flexible String Similarity
This paper proposes a novel framework for image retrieval. The retrieval is treated as searching for an ordered cycle in an image database. The optimal cycle can be found by minimizing the geometric manifold entropy of images. The minimization is solved by the proposed method, fast active tabu search. Experimental results demonstrate the framework for image retrieval is feasible and quite promi...
String kernels and similarity measures for information retrieval
Measuring a similarity between two strings is a fundamental step in many applications in areas such as text classification and information retrieval. Lately, kernel-based methods have been proposed for this task, both for text and biological sequences. Since kernels are inner products in a feature space, they naturally induce similarity measures. Information-theoretical approaches have also bee...
Employing Trainable String Similarity Metrics for Information Integration
The problem of identifying approximately duplicate objects in databases is an essential step for the information integration process. Most existing approaches have relied on generic or manually tuned distance metrics for estimating the similarity of potential duplicates. In this paper, we present a framework for improving duplicate detection using trainable measures of textual similarity. We pr...
String Metrics and Word Similarity applied to Information Retrieval
Over the past three decades, Information Retrieval (IR) has been studied extensively. The purpose of information retrieval is to assist users in locating the information they are looking for. Information retrieval is currently being applied in a variety of application domains, from database systems to web search engines. Its main idea is to locate documents that contain terms the u...
A Dependency Treelet String Correspondence Model for Statistical Machine Translation
This paper describes a novel model using dependency structures on the source side for syntax-based statistical machine translation: the Dependency Treelet String Correspondence Model (DTSC). The DTSC model maps source dependency structures to target strings. In this model, translation pairs of source treelets and target strings with their word alignments are learned automatically from the parsed and...
Journal
Journal title: Progress in Informatics
Year: 2009
ISSN: 1349-8614, 1349-8606
DOI: 10.2201/niipi.2009.6.7